Lab 02 - Take a sad plot and make it better

Image by Fauxels from Pexels

Your task in this lab is to improve a plot that violates many data visualisation best practices. We want you to get creative and make a visualisation that tells a (much!) better story than the original plot.

Learning goals

Complete the following steps during the workshop with your team.

We will also go through the steps for working collaboratively in your teams, in preparation for your projects. Please read the instructions carefully to see who needs to do what and when, and ask a tutor for assistance if you are stuck.

Warm up with your team

Take 5 minutes to go around the team, each pointing out one error in the following visual.

Once you are done with this, give a number to each team member. If you want to assign numbers at random, you can go to the Random.org sequence generator, and assign the first number that appears to the person whose name is first in the alphabet, and so on. Skip over any numbers that are larger than the size of your team.

In this lab, team members will take turns sharing their screen and working on an exercise in the common team repo, commit and push their changes, and then the next team member will take over and pull the changes before they make any further changes to their lab. In the lab instructions you will see markers for

Getting started

Repository

TEAM MEMBER 1:

TEAM MEMBERS 2+:

EVERYONE:

Adding your name

Important: For the next few steps, only one person at a time should be working. Everyone else should take their hands off their computer and not jump ahead! If you come across a problem (specifically a merge conflict), raise your hand for a tutor to help.

TEAM MEMBER 1:

TEAM MEMBER 2:

OTHER TEAM MEMBERS:

EVERYONE:

Working collaboratively!

Congratulations, you have now started working collaboratively from the same repository in GitHub. This will be extremely useful in your projects for sharing out the workload amongst team members.

Git is usually able to merge the version of the repository on GitHub with the updated version you push from RStudio automatically. Typically, each team member works in a different location in the repository, so there should not be any major merge issues. However, in the instructions above each team member was changing the same line in the R Markdown document, so it was important that only one member did their task at any one time. Otherwise you would have created a merge conflict (ask a tutor for help if this happens to your team). We will discuss how to resolve merge conflicts in the next lab.

Packages

EVERYONE: Before getting started with the Exercises, run the following code in the Console to load this package.

library(tidyverse)

Exercises

TEAM MEMBER 1 should write the answer to Exercise 1, and then commit and push their changes. Everyone else: participate, help out, but no typing in the R Markdown document and no committing/pushing!

  1. If the long data will have a row for each year/faculty type combination, and there are 5 faculty types and 11 years of data, how many rows will the data have? Discuss as a team and write down your answer.
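As a small illustration of what "one row per combination" means, here is a sketch using made-up values rather than the lab data. The year and category values are assumptions chosen only to keep the example tiny.

```r
library(tidyr)

# Made-up example (not the lab data): 3 years and 2 categories
combos <- expand_grid(
  year = c("2001", "2002", "2003"),
  category = c("a", "b")
)

# One row per year/category combination
combos
nrow(combos)
```

The same reasoning, with the numbers given in the exercise, applies to the lab data.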

🧶 ✅ ⬆️ At this point TEAM MEMBER 1 should knit the Rmd, stage, commit, and push their changes to GitHub with an appropriate commit message. Make sure to commit and push all changed files so that your Git pane is cleared up afterwards.


TEAM MEMBER 2 should now pull ⬇️ before doing anything else. They should then write the answers to Exercises 2 and 3, and then commit and push their changes. Everyone else: participate, help out, but no typing in the R Markdown document and no committing/pushing!

We do the wide-to-long conversion using pivot_longer(). The animation below shows how this function works, as well as its counterpart pivot_wider().

Quick reminder: the function has the following arguments:

pivot_longer(data, cols, names_to = "name", values_to = "value")
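To make the arguments concrete, here is a minimal sketch on a hypothetical data frame. The name `toy_wide`, its columns, and its values are all made up for illustration and are not the lab data.

```r
library(tidyr)

# Hypothetical wide data: one row per category, one column per year
toy_wide <- tibble::tibble(
  category = c("a", "b"),
  `2001` = c(10, 90),
  `2002` = c(20, 80),
  `2003` = c(30, 70)
)

toy_long <- toy_wide %>%
  pivot_longer(
    cols = -category,    # pivot every column except `category`
    names_to = "year",   # the old column names are stored here
    values_to = "value"  # the old cell values are stored here
  )

toy_long
```

Each of the two original rows is spread over three rows, one per year column, and the former column names end up as text in the new `year` column.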
  2. Fill in the blanks in the following code chunk to pivot the staff data longer and save it as a new data frame called staff_long.
staff_long <- ___ %>%
  ___(
    cols = ___,
    names_to = "___",
    values_to = "___"
  )
  3. Inspect staff_long to check whether your answer to Exercise 1 about the number of rows was correct.

🧶 ✅ ⬆️ At this point TEAM MEMBER 2 should knit the Rmd, commit, and push their changes to GitHub with an appropriate commit message. Make sure to commit and push all changed files so that your Git pane is cleared up afterwards.


TEAM MEMBER 3 should now pull ⬇️ before doing anything else. They should then write the answers to Exercises 4 and 5, and then commit and push their changes. Everyone else: participate, help out, but no typing in the R Markdown document and no committing/pushing!

  4. We will plot instructional staff employment trends as a line plot. A possible approach for creating a line plot where we colour the lines by faculty type is the following, but it does not quite look right. What is wrong with the graph? (You do not need to say how to fix it here; that is the next question!)
staff_long %>%
  ggplot(aes(x = year, y = value, color = faculty_type)) +
  geom_line()
  5. Next, add a group aesthetic to the plot (grouping by faculty_type) and plot again. What does the plot reveal about instructional staff employment trends over the years?
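For reference, the group aesthetic controls which observations geom_line() connects into a single line. The sketch below uses a made-up data frame (`toy_long` and its values are assumptions, not the lab data) to show where the aesthetic goes.

```r
library(ggplot2)

# Made-up long data (not the lab data)
toy_long <- data.frame(
  year = rep(c("2001", "2002", "2003"), times = 2),
  category = rep(c("a", "b"), each = 3),
  value = c(10, 20, 30, 90, 80, 70)
)

# `group` tells geom_line() which rows belong to the same line
p <- ggplot(
  toy_long,
  aes(x = year, y = value, group = category, color = category)
) +
  geom_line()

p
```

With two categories, this should draw two lines, one per category.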

🧶 ✅ ⬆️ At this point TEAM MEMBER 3 should knit the Rmd, commit, and push their changes to GitHub with an appropriate commit message. Make sure to commit and push all changed files so that your Git pane is cleared up afterwards.


TEAM MEMBER 4 should now pull ⬇️ before doing anything else. They should then write the answer to Exercise 6, and then commit and push their changes. Everyone else: participate, help out, but no typing in the R Markdown document and no committing/pushing! If your team has fewer than 4 people, cycle back round to TEAM MEMBER 1.

  6. Improve the line plot from the previous exercise by fixing up its labels (title, axis labels, and legend label) as well as any other components you think could benefit from improvement.
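As a reminder of the mechanics, labels can be set with labs(). This is only a sketch on made-up data: `toy_long`, the label text, and the choice of theme_minimal() are assumptions for illustration, not a recommended answer.

```r
library(ggplot2)

# Made-up long data (not the lab data)
toy_long <- data.frame(
  year = rep(c("2001", "2002", "2003"), times = 2),
  category = rep(c("a", "b"), each = 3),
  value = c(10, 20, 30, 90, 80, 70)
)

p <- ggplot(
  toy_long,
  aes(x = year, y = value, group = category, color = category)
) +
  geom_line() +
  labs(
    title = "Made-up values by category",  # plot title
    x = "Year",                            # x-axis label
    y = "Value",                           # y-axis label
    color = "Category"                     # legend title for the color scale
  ) +
  theme_minimal()

p
```

The name used inside labs() for the legend title matches the aesthetic that creates the legend, here `color`.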

🧶 ✅ ⬆️ At this point TEAM MEMBER 4 should knit the Rmd, commit, and push their changes to GitHub with an appropriate commit message. Make sure to commit and push all changed files so that your Git pane is cleared up afterwards.


The next team member should now pull ⬇️ before doing anything else. They should then write the answers to Exercises 7 and 8, and then commit and push their changes. Everyone else: participate, help out, but no typing in the R Markdown document and no committing/pushing! If someone in your team is participating remotely, remember to share screens.

  7. Suppose the objective of this plot was to show that the proportion of part-time faculty has gone up over time compared to other instructional staff types. What changes would you propose making to this plot to tell this story? Write down your idea(s). The more precise you are, the easier the next step will be. Get creative, and think about how you can modify the dataset to give you new/different variables to work with.
  8. Implement at least one of the ideas you came up with in the previous exercise. You should produce an improved data visualisation and accompany it with a brief paragraph describing the choices you made in your improvement, specifically discussing what you didn’t like in the original plot and why, and how you addressed those issues in the visualisation you created.
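One general way to "modify the dataset to give you new variables" is to derive a column with mutate(). The sketch below is on made-up data; `toy_long`, the `highlight` column, and its labels are hypothetical and only show the mechanics, not a suggested solution.

```r
library(dplyr)

# Made-up long data (not the lab data)
toy_long <- data.frame(
  year = rep(c("2001", "2002", "2003"), times = 2),
  category = rep(c("a", "b"), each = 3),
  value = c(10, 20, 30, 90, 80, 70)
)

# Derive a new variable that separates one category from all the others
toy_flagged <- toy_long %>%
  mutate(
    highlight = if_else(category == "a", "Category a", "Everything else")
  )

toy_flagged
```

A derived column like this can then be mapped to an aesthetic such as color in the same way as any original column.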

🧶 ✅ ⬆️ At this point the team member should knit the Rmd, commit, and push their changes to GitHub with an appropriate commit message. Make sure to commit and push all changed files so that your Git pane is cleared up afterwards.

Aim to make it to this point during the workshop.

Wrapping up

Go back through your write-up to make sure you are following the coding style guidelines we discussed in class. Make any edits as needed.

Also, make sure all of your R chunks are properly labelled and your figures are reasonably sized.

Once the last person pushes their final changes, others should pull the changes and knit the R Markdown document to confirm that they can reproduce the report.

Making and managing personal copies

TEAM MEMBERS 2+: The following steps will ensure that you have your own copy of today’s lab worksheet and maintain all of the communication between RStudio and GitHub so that you can attempt any outstanding exercises on your own time in your own version of the worksheet.

TEAM MEMBER 1: After today’s lab, you may want to prevent others from making any further changes to the worksheet.

- First ensure that your team members have pulled the latest version of the repository.
- In the lab worksheet repository on GitHub, go to Settings and then Collaborators.
- Remove your team members so that they can no longer push changes to the repository.

More sad plots

Want to see more sad plots?